DESCRIPTION

The SSI 263A is a versatile, high-quality, phoneme-based
speech synthesizer circuit contained in a single
monolithic CMOS integrated circuit. It is designed to
produce an audio output of unlimited vocabulary,
music and sound effects at an extremely low data
input rate.


Speech is synthesized by combining phonemes, the
building blocks of speech, in an appropriate sequence.
The SSI 263A contains five eight-bit registers that allow
software control of speech rate, pitch, pitch movement
rate, amplitude, articulation rate, vocal tract filter
response, and phoneme selection and duration.
FEATURES

• Single low-power CMOS integrated circuit

• 5 Volt supply

• Extremely low data rate

• 8-bit bus compatible with selectable handshaking
modes

• Non-dedicated speech, ideal for text-to-speech
programming

• Programmable and hard powerdown/reset mode

• Switched-capacitor-filter technology
SSI 263A Pin Out (Top View)


SSI 263A Operation Description


This short description is intended to provide SSI 263A 
feature and capability information only. Refer to the 
SSI 263A USERS GUIDE for complete information on 
application and phonetic programming. 


The Production of Speech 

To produce different speech phonemes (sounds) the
SSI 263A uses a model of the human vocal tract. Within
the device this analog tract is modeled with five
cascaded programmable low pass filter sections. The
filter sections are programmed internally by a digital
controller. Either a glottal (pitch) or a pseudo-random
noise source is used to excite the vocal tract,
depending on whether a voiced or non-voiced phoneme is
selected. During speech production the phonemes will
typically last between 25 and 100 ms.


The Speech Attribute Registers 

Speech is produced by programming speech attribute
(characteristic) data into five eight-bit registers. These
internal registers allow selection of phonemes and
speech characteristics. Refer to the Register Input
Formats for the functional allocations.


Device Response to Attribute Register Data 

The SSI 263A has two general classes of attribute data:
"control" data (speech rate, filter frequency, phoneme
articulation rate, phoneme duration, immediate
inflection setting, and inflection movement rate) and
"target" data (phoneme selection, audio amplitude, and
transitioned inflection). The SSI 263A responds immediately
upon loading "control" data; upon loading "target"
data the device will begin to move towards that target
at the prescribed transition rates. This fully internal
linear transitioning between target values, done in a
manner similar to that found in normal speech, is a key
factor in reducing control data rate without sacrificing
speech quality.


Attribute Register Writing 

The eight-bit data bus D7-D0 loads the particular
attribute register selected by the three-bit address bus
RS2-RS0. To write the data, the R/W (Read/Write), CS0
(Chip Select 0), and CS1 pins must first be in the 0,1,0
state, respectively. The data is then written when at
least one of these pins changes state. Refer to the
Write Timing Diagram. Preferably, the write is performed
by changing CS0 or CS1. Following device
power up, nominal values should be loaded into the
attribute registers as described below.
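
The write sequence described above can be sketched as a pair of bus states. This is only an illustration in Python, not driver code: the register addresses follow the Register Input Formats table (the don't-care bits of the Filter Frequency address are assumed to be 0), and `write_cycle`, `REGISTERS`, and the dictionary keys are hypothetical names chosen for this example.

```python
# Register addresses on RS2-RS0, from the Register Input Formats table.
# Filter Frequency is HI,X,X; the don't-care bits are assumed to be 0 here.
REGISTERS = {"DR/P": 0b000, "I": 0b001, "R/I": 0b010, "C/A/A": 0b011, "F": 0b100}


def write_cycle(register, data):
    """Return the two bus states of one register write (hypothetical helper)."""
    if not 0 <= data <= 0xFF:
        raise ValueError("data must fit on the eight-bit bus D7-D0")
    # Step 1: R/W, CS0, CS1 in the 0,1,0 state with address and data valid.
    selected = {"RS": REGISTERS[register], "D": data, "RW": 0, "CS0": 1, "CS1": 0}
    # Step 2: changing one of the pins (here CS1) writes the data.
    deselected = dict(selected, CS1=1)
    return [selected, deselected]


for state in write_cycle("C/A/A", 0x5C):
    print(state)
```

On real hardware each dictionary would correspond to driving the pins, with the setup and hold times from the Timing Characteristics respected between the two steps.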


Approximate Data Transfer Rate 

For speech production using the SSI 263A, the actual
data rate depends on the amount of speech attribute
manipulation. For example, the production of
monotonic speech, where phoneme and duration are
the only attribute manipulations, requires a data rate
less than 100 bits-per-second. A higher data rate of
about 500 bits-per-second is required for high quality
speech due to the associated full attribute manipulation.
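
As a rough check on these figures, the data rate can be estimated from the number of eight-bit register writes per phoneme. The sketch below is an assumption-laden estimate, not a specification: `data_rate_bps` is a hypothetical helper, and the 100 ms and 80 ms phoneme durations are illustrative values chosen from within the typical 25 to 100 ms range.

```python
def data_rate_bps(writes_per_phoneme, phoneme_ms):
    """Bits per second when each phoneme needs this many 8-bit register writes."""
    return writes_per_phoneme * 8 * 1000 / phoneme_ms


# Monotonic speech: one Duration/Phoneme write per 100 ms phoneme.
print(data_rate_bps(1, 100))
# Full manipulation: all five registers rewritten for every 80 ms phoneme.
print(data_rate_bps(5, 80))
```

Under these assumptions the two cases come out at 80 and 500 bits-per-second, consistent with the approximate rates quoted above.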


Selectable Operation Modes 

The state of the Duration/Phoneme Register bits DR1
and DR0 determines the operating mode of the device
when the Control bit (CTL) is changed from a logic one
to a logic zero. The four modes of operation include
choice of timing response between "frame" or
"phoneme" timing (as explained below), transitioned or
immediate inflection response, and setting the A/R
(Acknowledge/Request Not) pin active or disabled.
Refer to the Mode Selection Chart.
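
The four modes can be written down as a small lookup, shown here as a Python sketch. The entries are taken from the Mode Selection Chart; the names `MODES` and `select_mode` and the dictionary layout are hypothetical and exist only for this illustration.

```python
# Keyed by (DR1, DR0) as sampled when CTL changes from 1 to 0.
MODES = {
    (1, 1): {"ar_active": True, "timing": "phoneme", "inflection": "transitioned"},
    (1, 0): {"ar_active": True, "timing": "phoneme", "inflection": "immediate"},
    (0, 1): {"ar_active": True, "timing": "frame", "inflection": "immediate"},
    # LO,LO only disables the A/R output; the previous response is unchanged,
    # so no timing or inflection mode is recorded for it here.
    (0, 0): {"ar_active": False, "timing": None, "inflection": None},
}


def select_mode(dr1, dr0):
    """Look up the mode selected by DR1 and DR0 (hypothetical helper)."""
    return MODES[(dr1, dr0)]


print(select_mode(1, 1))  # the most commonly used mode
```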


Phoneme Selection 

The SSI 263A can produce the 64 phonemes listed on
the Phoneme Chart. Bits P5-P0 are used for phoneme
selection. The relative phoneme duration is set by bits
DR1 and DR0.
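
A byte for the Duration/Phoneme Register can be assembled as in the following Python sketch. It assumes that DR1 and DR0 occupy the two most significant bus bits and P5-P0 the remaining six; `duration_phoneme_byte` is a hypothetical helper name.

```python
def duration_phoneme_byte(phoneme, duration=0):
    """Pack DR1,DR0 (assumed D7,D6) and P5-P0 (assumed D5-D0) into one byte."""
    if not 0 <= phoneme <= 0x3F:
        raise ValueError("phoneme code must be 00 to 3F hex")
    if not 0 <= duration <= 3:
        raise ValueError("duration setting must be 0 to 3")
    return (duration << 6) | phoneme


# Phoneme 2C (HF) with duration setting 0, the longest duration.
print(hex(duration_phoneme_byte(0x2C, 0)))
```

With the duration bits at zero the byte equals the hex code shown on the Phoneme Chart, which matches the note under that chart.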


Phoneme Articulation Adjustment 

A particular phoneme is produced by the combination
of vocal-tract low-pass filter settings, excitation source
type, and source amplitude. When a new phoneme is
selected, the device performs a linear transition to the
new set of characteristics. The rate of this transition is
controlled by the articulation setting, bits TR2-TR0. This
rate is relative in that articulation is not affected by
speech rate bits R3-R0. A typical articulation register
setting is "5".


Programming Inflection (Pitch)

When the SSI 263A is in the mode of immediate
inflection, bits I11-I0 provide immediate adjustment with
seven octaves of pitch on an even tempered scale.
With the device in the transitioned inflection mode, bits
I10-I6 select the target pitch and bits I5-I3 determine
the inflection rate of change. Bits I11, I2, I1, and I0
always provide immediate adjustment. A typical value
used for speech production is 90Hz where:

Inflection Frequency = XCK frequency / (8 x (4096 - I))

I = decimal equivalent of Inflection Register setting
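
The inflection formula can be inverted to find the setting for a wanted pitch. The Python sketch below assumes an XCK frequency of 1.0 MHz purely for illustration, and assumes the bit placement from the Register Input Formats (I10-I3 in the Inflection Register, I11, I2, I1, I0 in the low four bits of the Rate/Inflection Register). All function names are hypothetical.

```python
def inflection_frequency(xck_hz, i):
    """Pitch in Hz for a 12-bit inflection setting I (0 to 4095)."""
    return xck_hz / (8 * (4096 - i))


def inflection_setting(xck_hz, target_hz):
    """Nearest 12-bit setting I for a target pitch, clamped to 0 to 4095."""
    i = round(4096 - xck_hz / (8 * target_hz))
    return max(0, min(4095, i))


def split_inflection(i):
    """Split I into the I10-I3 byte and the I11,I2,I1,I0 nibble."""
    return (i >> 3) & 0xFF, ((i >> 11) & 1) << 3 | (i & 0x7)


i = inflection_setting(1_000_000, 90)
print(i, inflection_frequency(1_000_000, i), split_inflection(i))
```

With these assumptions the 90Hz typical value corresponds to a setting of 2707 decimal; a different XCK frequency gives a different setting.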


Filter Frequency Setting

Data bits FF7-FF0 set the clock frequency for the
switched-capacitor vocal tract filters. This determines
overall filter frequency response. Inflection pitch is not
affected by these bits. Typically this is set to give a
clock frequency of about 20KHz (see formula below),
but may be manipulated to fine-tune speech quality or
to change "voice type": bass, baritone, etc.

Filter Frequency = XCK frequency / (2 x (256 - FF))

FF = decimal equivalent of the Filter Frequency
Register setting.
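
The filter clock formula can be evaluated in both directions as in this Python sketch. The 1.0 MHz XCK frequency is an assumed example value and the function names are hypothetical.

```python
def filter_frequency(xck_hz, ff):
    """Filter clock in Hz for Filter Frequency Register setting FF (0 to 255)."""
    return xck_hz / (2 * (256 - ff))


def filter_setting(xck_hz, target_hz):
    """Nearest register setting FF for a target filter clock, clamped to 0 to 255."""
    ff = round(256 - xck_hz / (2 * target_hz))
    return max(0, min(255, ff))


ff = filter_setting(1_000_000, 20_000)
print(ff, hex(ff), filter_frequency(1_000_000, ff))
```

Under the 1.0 MHz assumption, the typical 20KHz filter clock corresponds to a register setting of 231 decimal (E7 hex).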


Speech Rate

Rate of speech is controlled by bits R3-R0, the Speech
Rate Register. In Frame Timing Mode new attribute
data is requested at the end of a "frame" where:

Frame Duration = 4096 x (16 - R) / XCK frequency

R = decimal equivalent of Rate Register setting

In the Phoneme Timing Mode the frame duration is
modified by the phoneme duration bits DR1 and DR0
where:

Phoneme Duration = (Frame Duration) x (4 - D)

D = decimal equivalent of Duration Register setting

All internal attribute transitioning is performed relative
to the Speech Rate Register setting. Speech rate does
not affect inflection or filter frequency. A typical rate
setting is hexadecimal "A".
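
The two timing formulas can be checked numerically. The Python sketch below assumes an XCK frequency of 1.0 MHz as an example value; the function names are hypothetical.

```python
def frame_duration_s(xck_hz, r):
    """Frame duration in seconds for Rate Register setting R (0 to 15)."""
    return 4096 * (16 - r) / xck_hz


def phoneme_duration_s(xck_hz, r, d):
    """Phoneme duration in seconds for rate R and Duration setting D (0 to 3)."""
    return frame_duration_s(xck_hz, r) * (4 - d)


# Typical rate setting, hexadecimal A, at an assumed 1.0 MHz XCK frequency.
for d in range(4):
    print(d, round(phoneme_duration_s(1_000_000, 0xA, d) * 1000, 2), "ms")
```

With these assumptions a frame lasts about 24.6 ms, and the four duration settings span roughly 25 to 98 ms, in line with the typical phoneme lengths of 25 to 100 ms.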


Amplitude Adjustment 

The overall Audio Output level is set with register bits
A3-A0. Since each phoneme has a preset amplitude
relative to other phonemes, it is not necessary to
program the amplitude of each phoneme; however,
amplitude changes may be used to enhance the speech
quality and add emphasis. Amplitude is transitioned
linearly at a rate dependent on the phoneme duration
setting. A typical amplitude setting is hexadecimal "C".


Control Bit and Power Down Mode

Setting the Control bit (CTL) to a logic one puts the
device into Power Down mode, a sort of "standby".
This bit is also set high when the PD/RST pin is
brought low and also upon power up. The Power Down
mode turns off the excitation sources and analog
circuits to reduce power consumption, but maintains the
present register settings. Upon a Control bit logic
one-to-zero transition, the present settings of DR1 and DR0
determine the operation mode as described above.
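
One way to read the mode-selection procedure is as three register writes: raise CTL, load DR1 and DR0, then drop CTL. The Python sketch below illustrates that reading only. It assumes the bit layout CTL, T2-T0, A3-A0 from most to least significant bit in the Control/Articulation/Amplitude register, uses the typical articulation and amplitude settings quoted in this description, and uses hypothetical helper names.

```python
def caa_byte(ctl, articulation=5, amplitude=0xC):
    """Pack CTL, T2-T0 and A3-A0 into one byte (assumed layout, MSB first)."""
    if ctl not in (0, 1) or not 0 <= articulation <= 7 or not 0 <= amplitude <= 0xF:
        raise ValueError("field out of range")
    return (ctl << 7) | (articulation << 4) | amplitude


def mode_select_writes(dr1, dr0):
    """Register writes that take CTL high, set DR1/DR0, then bring CTL low."""
    return [
        ("C/A/A", caa_byte(1)),             # CTL = 1: Power Down mode
        ("DR/P", (dr1 << 7) | (dr0 << 6)),  # DR1, DR0 with phoneme 00 (pause)
        ("C/A/A", caa_byte(0)),             # CTL one-to-zero selects the mode
    ]


for register, value in mode_select_writes(1, 1):
    print(register, hex(value))
```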


Register Reading 

Device pin D7 becomes an output, as the inverted state
of A/R, when the device is put into Read (R/W is a logic
1 and the chip is selected, CS1 = 0, CS0 = 1). Refer to
the Read Timing Diagram. The register address bits are
ignored.


Time Base 

Many different time bases may be utilized (see external
clock input specifications). It is desirable to establish a
stable crystal controlled time base from 800 to
1000 kHz when DIV2 is set low, or twice the frequency
when DIV2 is set high. A good time base can be easily
accomplished with an inexpensive colorburst 3.5795
MHz crystal in conjunction with a divide-by-two circuit.
The actual device timing and output frequencies are
directly related to the time base frequency used.
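
The colorburst example can be checked with a few lines of Python. The sketch assumes that setting DIV2 high halves the external clock inside the device, which is how the "twice the frequency" wording is read here; the function names are hypothetical.

```python
def internal_clock_hz(xck_hz, div2):
    """Clock after the optional divide-by-two selected by DIV2 (assumed behavior)."""
    return xck_hz / 2 if div2 else xck_hz


def in_recommended_range(xck_hz, div2):
    """True when the resulting time base lies between 800 and 1000 kHz."""
    return 800_000 <= internal_clock_hz(xck_hz, div2) <= 1_000_000


crystal_hz = 3_579_500       # colorburst crystal, 3.5795 MHz
xck_hz = crystal_hz / 2      # external divide-by-two circuit
print(xck_hz, internal_clock_hz(xck_hz, True), in_recommended_range(xck_hz, True))
```

Under that assumption the halved colorburst clock with DIV2 high gives a time base of about 895 kHz, inside the recommended range.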


Microprocessor Interfacing 

Either the A/R line, or D7 as an output, is used as an
interrupt to indicate when the duration of a frame or
phoneme has been exceeded. No detectable degradation
to speech quality results when several milliseconds
elapse between data request and load.


PHONEME CHART

Hex Code*  Phoneme Symbol  Example Word (or Usage)

00         PA              (pause)
01         E               MEET
02         E1              BENT
03         Y               BEFORE
04         YI              YEAR
05         AY              PLEASE
06         IE              ANY
07         I               SIX
08         A               MADE
09         AI              CARE
0A         EH              NEST
0B         EH1             BELT
0C         AE              DAD
0D         AE1             AFTER
0E         AH              GOT
0F         AH1             FATHER
10         AW              OFFICE
11         O               STORE
12         OU              BOAT
13         OO              LOOK
14         IU              YOU
15         IU1             COULD
16         U               TUNE
17         U1              CARTOON
18         UH              WONDER
19         UH1             LOVE
1A         UH2             WHAT
1B         UH3             NUT
1C         ER              BIRD
1D         R               ROOF
1E         R1              RUG
1F         R2              MUTTER (German)
20         L               LIFT
21         L1              PLAY
22         LF              FALL (final)
23         W               WATER
24         B               BAG
25         D               PAID
26         KV              TAG (glottal stop)
27         P               PEN
28         T               TART
29         K               KIT
2A         HV              (hold vocal)
2B         HVC             (hold vocal closure)
2C         HF              HEART
2D         HFC             (hold fricative closure)
2E         HN              (hold nasal)
2F         Z               ZERO
30         S               SAME
31         J               MEASURE
32         SCH             SHIP
33         V               VERY
34         F               FOUR
35         THV             THERE
36         TH              WITH
37         M               MORE
38         N               NINE
39         NG              RANG
3A         :A              MARCHEN (German)
3B         :OH             LOWE (French)
3C         :U              FUNF (German)
3D         :UH             MENU (French)
3E         E2              BITTE (German)
3F         LB              LUBE


*Note: Hex codes shown with DR0, DR1 = 0 (longest Duration)
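
For software use, the chart can be treated as a lookup from phoneme symbol to hex code. The Python sketch below copies only a handful of entries from the chart; the names `PHONEMES` and `encode` are hypothetical, and the example sequence is an arbitrary string of symbols, not a vetted transcription of any word.

```python
# A partial copy of the Phoneme Chart: symbol -> hex code.
PHONEMES = {
    "PA": 0x00, "E": 0x01, "AH": 0x0E, "L": 0x20,
    "HF": 0x2C, "S": 0x30, "M": 0x37, "LB": 0x3F,
}


def encode(symbols):
    """Translate phoneme symbols into codes; raises KeyError on unknown symbols."""
    return [PHONEMES[symbol] for symbol in symbols]


print([hex(code) for code in encode(["S", "AH", "M", "PA"])])
```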


SSI STANDARD PRODUCTS: TELECOMMUNICATIONS CIRCUITS

Part No.     Circuit Function             Characteristics                                          Package

Tone Signaling Products
SSI 201      Integrated DTMF Receiver     Hexadecimal or binary 2-of-8 output                      22 DIP
SSI 202      Integrated DTMF Receiver     Low power, hex or binary output                          18 DIP
SSI 203      Integrated DTMF Receiver     Hex or binary output, Early Detect                       18 DIP
             Integrated DTMF Receiver     Low-power, binary output                                 14 DIP
SSI 957      Integrated DTMF Receiver     Early Detect, Dial Tone reject                           5V, 22 DIP
SSI 20C89    Integrated DTMF Transceiver  Generator and Receiver, uP interface                     5V, 22 DIP
SSI 20C90    Integrated DTMF Transceiver  Generator and Receiver, uP interface, Call Progress      5V, 22 DIP
                                          Detect
             Call Progress Detector       Detects supervision tones, Teltone second-source

SSI K212     1200/300 Baud Modem          DPSK/FSK, single chip, autodial, Bell 212A               10V, 28 DIP
SSI 223      1200 Baud Modem              FSK, HDX/FDX                                             10V, 16 DIP
SSI 291/213  1200 Baud Modem              DPSK, two chips, low-power                               40/16 DIP
SSI 3522     1200 Baud Modem Filter       Bell 212 compatible, AMI second-source                   10V, 16 DIP

Speech Synthesis Products
SSI 263A     Speech Synthesizer           Phoneme-based, low data rate, VOTRAX second-source       5V, 24 DIP

Switching Products
SSI 80C50                                 Bell D2, D3, D4, serial format and mux, low power        5V
SSI 80C60    T1 Receiver                  Bell D2, D3, serial synchron. and demux, low power       28 DIP, Q
SSI 22100    Cross-point Switch           4x4x1, control memory, RCA second-source
SSI 22101/2  Cross-point Switch           4x4x2, control memory, RCA second-source
SSI 22106    Cross-point Switch           8x8x1, control memory, RCA second-source                 28 DIP
SSI 22301    PCM Line Repeater            T1 carrier signal recondition                            18 DIP





No responsibility is assumed by SSi for use of this product
nor for any infringements of patents and trademarks or other
rights of third parties resulting from its use. No license is
granted under any patents, patent rights or trademarks of
SSi. SSi reserves the right to make changes in
specifications at any time and without notice.


Printed in U.S.A. 8M IAE 9/85





PIN ASSIGNMENT DESCRIPTIONS

Pin No.  Symbol   Active Level  Description

1        AO                     Analog Audio Output, biased at VDD/2;
                                requires an external audio amp for
                                speaker drive
2        AGND                   Analog Ground
3        TP1                    Do not use
4        A/R                    Acknowledge/Request Not: open collector
                                output changes from high to low level
                                after phoneme is generated. May be used
                                as an interrupt request for new phoneme
                                data. (See Pin 17 also.)
5        TP2                    Do not use
6        RS2                    Register Select Input: used to select one
                                of five internal registers in conjunction
                                with RS1 and RS0
7        RS1                    Register Select (See pin 6)
8        RS0                    Register Select (See pin 6)
9        D0                     LSB of 8-bit data bus; input only
10       D1                     Data Input (only)
11       D2                     Data Input (only)
12       DGND                   Digital Ground
13       D3                     Data Input (only)
14       D4                     Data Input (only)
15       D5                     Data Input (only)
16       D6                     Data Input (only)
17       D7                     MSB of 8-bit data bus. Bidirectional,
                                inverse of pin 4 when read is high
18       PD/RST   Low           Power Down Control Input: silences audio
                                output and retains DC bias without
                                disturbing register contents. Disables
                                A/R output.
19       CS0      High          Chip Select Input
20       CS1      Low           Chip Select Input
21       R/W                    Read/Write Control Input: Write is active
                                low for loading internal registers. Read
                                is active high but enables D7 only.
22       XCK                    Clock Input (1 or 2 MHz)
23       DIV2     High          Clock Divide by Two: used when external
                                clock is 2 MHz
24       VDD                    Positive Voltage Supply


REGISTER INPUT FORMATS

Register Address                                             Bus Input Bit Position
RS2  RS1  RS0   Register Name                                D7   D6   D5   D4   D3   D2   D1   D0

LO   LO   LO    Duration/Phoneme (DR/P)                      DR1  DR0  P5   P4   P3   P2   P1   P0
LO   LO   HI    Inflection (I)                               I10  I9   I8   I7   I6   I5   I4   I3
LO   HI   LO    Rate/Inflection (R/I)                        R3   R2   R1   R0   I11  I2   I1   I0
LO   HI   HI    Control/Articulation/Amplitude (C/A/A)       CTL  T2   T1   T0   A3   A2   A1   A0
HI   X    X     Filter Frequency (F)                         F7   F6   F5   F4   F3   F2   F1   F0

DR1, DR0 ... Define the phoneme duration.
P5-P0 ...... Address the phoneme required.
I11-I0 ..... Define inflection target frequencies
             and rate of change.
R3-R0 ...... Define the rate or speed of speech.
CTL ........ Defines the mode of A/R response in
             conjunction with DR1 and DR0.
             Also directly set by PD/RST.
T2-T0 ...... Define the rate of movement of the formant
             position for articulation purposes.
A3-A0 ...... Define the amplitude of the output audio.
F7-F0 ...... Define the frequency of all vocal tract
             filters.


WRITE TIMING DIAGRAM          READ TIMING DIAGRAM

[Timing diagrams: D0-D7, RS0/RS1/RS2, R/W, CS0 and CS1 for a write cycle; D7, RS0/RS1/RS2, R/W, CS0 and CS1 for a read cycle]

*Valid data latched on first rise or fall of R/W, CS0 or CS1 into inactive.


Timing Characteristics (VDD = 4.5 to 5.5 Volts, TA = -40 to +85 deg. C)

Parameter                 Symbol   Min    Max

Data Setup Time
Data Hold Time
Strobe Width                       200
Read/Write Cycle Time     TRW      225*
D7 Output Access Time
D7 Output Hold Time

Notes: * Based on color burst frequency.
       ** Timing relative to deselect by either CS0, CS1, or R/W changing.
























MODE SELECTION CHART

DR1  DR0  CTL bit   Function

HI   HI   HI to LO  A/R active; phoneme timing response; transitioned inflection
                    (most commonly used mode)
HI   LO   HI to LO  A/R active; phoneme timing response; immediate inflection
LO   HI   HI to LO  A/R active; frame timing response; immediate inflection
LO   LO   HI to LO  Disables A/R output only; does not change previous A/R response


ABSOLUTE MAXIMUM RATINGS

Input Voltage             -0.5 to VDD + 0.5
D.C. Current at Inputs    +1.0
Storage Temperature       -55 to +125
Operating Temperature     -40 to +85
Power Dissipation         500

Electrical Characteristics Uniess otherwise specified, 4.5 < Vpp <5.5; —40 deg. C< TA < 85 deg. C; 
1.50MHz < XCK frequency < 2.0MHz, when XCK/2 = logic 1 or 
0.75MHz s XCK frequency < 1.0MHz, when XCK/2 = logic 0 


Description Conditions Min. Typ. Max. | Units | 


POWER SUPPLY 





Supply Current PD/RST =0, CTL=1 


AUDIO OUTPUT ° 


Output Level AW phoneme 0.28VDD | 0.37VDD | 0.50VDD 


RL = 50Kohm to GND through 1yF cap. 


DG Output eet _ Loading AC coupled to AO to GND ee ee ee 
Capacitive Loading To GND to ensure Stable A PFE t0 |p 


Description Conditions | Symbol Min Typ Max Units 
BUS CONTROL INPUTS, DATA INPUTS (RSO, RS1, RS2, CSO, CS1, DO-D7 PD/RST) 
Input High Voltage 
















Input Low Voltage | 
Input Leakage Current VIN =0 to VoD 


Input Capacitance VIN =0 Ta =25°C 
measured at f = 1.0MHz 


Input Capacitance, D7 Input                                         CIN(D7)


Input Current, D7 in TRI-State "OFF" State    VIN = 0.4 to 2.4 V    IIN(TS)


D7 OUTPUT
D7 Output Low Voltage          ILoad = 0.4 mA into D7               VOL(D7)
D7 Output High Voltage         ILoad = 205 µA out of D7             VOH(D7)
A/R OUTPUT
Output Low Voltage             IL = 3.2 mA into A/R                 VOL(A/R)
Output High Leakage Current    VOut = 0.0 to VDD                    IL(A/R)


Output Capacitance             VOut = 0 VDC, TAMB = 25°C, f = 1.0 MHz





























DIV2 INPUT 
Input Low Voltage     VIL(DIV2)


Input High Voltage    VIH(DIV2)
Input Leakage         VIN = 0 to VDD














Description    Conditions    Symbol    Min.    Typ.    Max.    Units


XCLK 
Input Low Voltage     VIL(C)
Input High Voltage    VIH(C)





Input Current        VIN = 0.0 to VDD    IIN(C)
Input Capacitance                        CIN(C)    10


Duty Cycle D(XCLK) 0.4 

















TYPICAL MICROPROCESSOR IMPLEMENTATION

[Figure: SSI 263A connected to the microprocessor address bus, data bus, and control bus, with an LM386 audio amplifier; Φ2 = 1.0 MHz]


14351 Myford Road, Tustin, California
TWX 910-595-2809





User's Guide 


Phonetic Programming Using the SSI 263A 


Phonetics 


Every speech sound (phoneme) in any language may be 
represented by a special symbol (phonetic symbol). These
symbols are used in WRITING precisely the sound sequence 
(phonetic transcription) of a word according to the way it is 
pronounced. There are many different phonetic symbol sets 
(phonetic alphabets). Each would contain a minimum number of 
symbols to represent the basic sounds (phonemes) required to 
pronounce any word in the language. Additional symbols are 
usually included which represent sounds with slight to great
variations in the basic sounds (allophones). These symbols are
used to assist in the transcription of words that reflect a regional,
dialectal, or foreign pronunciation.


The process of transcribing a spoken word into its phonetic 
components begins with identifying the number of sounds in the 
word, then tagging each with a label to specify its type. 
Consonants and vowels are the most familiar labels but these may 
be broken down into subtypes (e.g., stop consonants, back 
vowels, etc.) as the need for more specificity arises. Once the 
sounds have been identified, their symbols are selected, then 
written in sequence. The resulting transcription should allow 
another person to identify the pronunciation without having heard 
the word spoken. 


Note that when using a phonetic alphabet to transcribe words into 
their sound sequences, there is not a one-to-one correspondence 
between the alphabet characters (orthographics) used to spell 
words and the phonetic symbols (phonetics) used to represent 
their pronunciations. For example, in the word “phones” there are 
6 letters but only 4 sounds. Conversely, the word “I” has 1 letter 
but 2 sounds. It may be of some assistance to keep a dictionary 
handy for reference. Dictionaries use their own phonetic system to 
describe the pronunciations of every word entry. It will be 
necessary to learn at least one phonetic alphabet in order to 
engage in phonetic transcription. The SSI 263A Phonetic Alphabet 
is the referent used in this manual. However, if another system is 
already known, it is easily translated into the referent. 


When transcribing vocabulary from orthography (standard 
alphabet spelling) to phonetics, it is common to place the phonetic 
sequence between right slash marks when the transcription
appears in running text. The word “phones,” for example, would be
transcribed as /F O N Z/ when using SSI 263A phonetic symbols.
This allows the reader easier identification of phonetic segments. 
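
The slash-mark convention also makes a transcription easy to split into its symbols by program. The following Python sketch shows one way to do it; the function name `parse_transcription` is a hypothetical helper, and it assumes symbols are separated by spaces as in /F O N Z/.

```python
def parse_transcription(text: str) -> list[str]:
    """Split a slash-delimited transcription such as '/F O N Z/' into symbols."""
    text = text.strip()
    if len(text) < 2 or not (text.startswith("/") and text.endswith("/")):
        raise ValueError("transcription must be enclosed in slash marks")
    # Drop the two slash marks, then split on whitespace.
    return text[1:-1].split()

print(parse_transcription("/F O N Z/"))  # ['F', 'O', 'N', 'Z']
```
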


SSI 263A Phonetic Alphabet 


The phonetic alphabet used to represent the SSI 263A phonemes 
is the SSI 263A PHONETIC ALPHABET. Refer to the Phoneme 
Chart for a complete listing of the phoneme symbols. 


Of the 64 alphanumeric symbols in the SSI 263A Phonetic 
Alphabet, 34 represent sounds BASIC to the pronunciation of
American English. The remaining 30 symbols fall into 2 groups: 
the ALLOPHONE group and the NO-SOUND group. The BASIC 
sound symbols are: 


A, AE, AH, AW, B, D, E, EH, ER, F, HF, I, J, K, KV, L, M, N, NG, O, 
OO, P, R, S, SCH, T, TH, THV, U, UH, V, W, Y, Z. 


Symbols in the ALLOPHONE group represent speech sounds that 
vary in pronunciation from one of the basic sounds. They may be 
used in transcribing words or word segments (syllables or 
morphemes) whose pronunciations are not satisfied by the basic 
phonemes alone (words rooted in a foreign language, words 
adapted by a regional dialect, etc.). The ALLOPHONE symbols 
are: 


AI, AE1, AH1, AY, E1, E2, EH1, HN, HV, IE, IU, IU1, L1, LB, LF,
OU, R1, R2, U1, UH1, UH2, UH3, YI, :A, :OH, :U, :UH.


The NO-SOUND symbols represent silent states. One of these 
symbols represents a “pause” state. It is used to separate 
phoneme sequences into phrase-like segments which assist in 
more closely imitating the natural pausing in human speech for 
breathing or for delayed emphasis. The “pause” is treated as a 
phoneme when it is selected for a transcription and will be subject 
to phoneme parameter programming. It has the ability to maintain 
the parametric levels of duration, inflection, amplitude, etc., during 
its silence, thus audibly affecting the movement of the preceding 
and following phonemes. Other NO-SOUND symbols represent 
“hold” states. They are used in combination with BASIC 
phonemes or ALLOPHONEs to generate articulation variations on 
their pronunciations. The NO-SOUND symbols are: 


HFC, HVC, PA. 
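
The three groups can be held as simple lists, which also lets the counts stated above (34 BASIC symbols plus 30 others, 64 in total) be checked. This is a sketch only: the helper `group_of` is hypothetical, and the spellings AI, IU1, UH3, YI, and :UH are assumed readings taken from the Phoneme Chart ordering.

```python
BASIC = ("A AE AH AW B D E EH ER F HF I J K KV L M N NG O "
         "OO P R S SCH T TH THV U UH V W Y Z").split()
ALLOPHONE = ("AI AE1 AH1 AY E1 E2 EH1 HN HV IE IU IU1 L1 LB LF "
             "OU R1 R2 U1 UH1 UH2 UH3 YI :A :OH :U :UH").split()
NO_SOUND = "HFC HVC PA".split()

def group_of(symbol: str) -> str:
    """Return the name of the group a symbol belongs to."""
    for name, members in (("BASIC", BASIC),
                          ("ALLOPHONE", ALLOPHONE),
                          ("NO-SOUND", NO_SOUND)):
        if symbol in members:
            return name
    raise KeyError(symbol)

print(len(BASIC), len(ALLOPHONE), len(NO_SOUND))  # 34 27 3
print(group_of("UH3"))                            # ALLOPHONE
```
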


Now that there is a tool to use for writing the sounds that are 
heard, the next stage is to identify the sounds that are produced 
by the SSI 263A speech synthesizer. 


SSI 263A Phoneme Review 


Thus far in this program, it has been established that: (1) spoken 
words are made up of a series of sounds; (2) each speech sound 
in a language may be represented by an identifying symbol; and 
(3) the spoken word may be written according to its sound 
sequence using these special symbols. Before a word may be 
written phonetically, however, users may wish to study further the 
SSI 263A speech sounds. What makes one sound different from 
another and how these differences may be helpful to phonetic 
programming will be essential information for phonetic 
programmers. 


The sound that is represented by each phonetic symbol in the SSI 
263A Phonetic Alphabet must be audibly learned. The easiest 
way to approach this task is to start with the sounds already 
known and associate a symbol with them. For example, from 
spelling we have already learned that vowels may be “long” or 
“short” and are often differentiated by their particular spelling 
formats. Every time a word with a “short a” sound is heard (sat, 
fat, cat, bat, happy, plaster, ankle, Saturday, amplify, contaminate, 
etc.) the symbol /AE/ should come to mind. A “long a” sound (fate, 
state, bait, lace, maybe, stable, arrangement, etc.) is actually a
diphthong (two sounds combined into a single unit) and may be 
represented by the symbols /A AY/. 


In standard orthography, there are only 5 vowel letters to 
represent 17 vowel sounds. In phonetics, each vowel sound will be
represented by its own symbol or symbol combination. 


Again, from spelling, we have learned that the letter “c” may have 
a hard sound as in “cat” or a soft sound as in “city.” The hard
sound is actually a /K/ as in “kite” and the soft sound is an /S/ as
in “sing.” Users must identify which sound (/K/ or /S/) is used in
the transcription of a “c.” You will not find a symbol C in a phonetic
alphabet. Like “C,” the letters “Q” and “X” will not be found in
phonetic alphabets. They are transcribed into the sound 
sequences /K W/ and /K PA S/. Refer to the Phoneme Chart 
during this learning period. It provides example words to describe 
the pronunciations corresponding to each symbol. 


Users may add more words to the examples above to continue 
identifying the symbol-sound relationship for /AE/ and /A AY/. 
Follow this technique for each symbol in the alphabet. For 
auditory verification, enter the sound that is being reviewed into 
the device. Speak aloud your example word for the SSI 263A
sound in an attempt to match that which the synthesizer is
emitting.


Example: /E/ = “long e” vowel sound 
= meat, read, need, repair, before, phoneme, 
erase, brief, people, timeliness, seniority, 
receive, catastrophe. 
Example: /F/ = “voiceless fricative” consonant 
= farm, false, aft, feet, finger, phrase, phone, 
Africa, alphabet, cough. 


Once you have reviewed auditorily the sounds you already have 
a familiarity with from spelling, proceed to the BASIC sound list in 
the above text and continue the review. Be aware that several 
consonant sounds will not provide output unless they have 
another sound following. This is the case with /B/, /D/, /P/, /T/, and 
/K/. When one of these sounds is entered into the SSI 263A, 
follow it by a vowel and listen to both in sequence. 


Users who already have a familiarity with phonetics and SSI 263A
synthetic sounds may wish to follow the sound review procedures
in order to auditorily determine the difference between two sounds 
or identify new ones. For example, enter the /UH/ phoneme into 
the device. Then enter /UH1/, /UH2/, and /UH3/. Listen to each 
sound noting the pronunciation variations. Be aware that there are 
no duplicate sounds resident on the SSI 263A chip. 


Whenever an SSI 263A sound is audited that cannot be readily
identified as to its appropriate usage, do not be concerned. The 
review is designed only to provide a method for establishing an 
auditory memory for each sound and a visual memory for its 
symbol. Phonetic programming may begin anytime after the initial 
review. Return to the review later as your familiarity with the 
BASIC sounds increases and as your need for sound alternatives 
to those BASIC sounds becomes more apparent. 


If there is a question as to which symbols should be chosen to 
transcribe a word into its sound sequence, make a written note of 
the word by circling the letter(s) that present the problem. Later, 
when phonetic programming has begun, a phoneme sequence 
may be created for the word and users may verify auditorily which 
phonetic selection produces the most appropriate translation. 


SSI 263A Phoneme Discussion 


The SSI 263A Phonetic Alphabet is divided into 3 groups for the
purpose of differentiating between phonemes and allophones.
Another way of dividing the Alphabet is according to usage. The
most familiar division is a two-section split: CONSONANT
sounds and VOWEL sounds. Within each of these sections,
sounds may be further subdivided according to the distinctive
features that best describe the sounds phonetically or
acoustically. The more that is known about a sound, the easier it
is to determine how it may be used in transcribing and phonetically
programming a word.


Consonant Sounds 


There are 22 Consonant Phonemes, subdivided according to their 
manner of production in the human speech mechanism. Some are 
characterized by the noise emitted when the articulators obstruct 
the air flow (Fricatives like /S/). Vowel-like consonants have the 
least amount of obstruction and may occasionally be used as a 
vowel substitute. Stop consonants are obstructed completely, 
release of air flow occurring at the onset of the next sound. Notice
that Affricates are a sequence of 2 sounds (a Stop followed by a 
Fricative) spoken as a single unit. Unlike vowels, which always 
have a vocal source during production, consonants may be voiced 
(V) or unvoiced (U) (no vocal source during air flow). When 
listening to the manner in which a consonant is produced during 
speech, note its special characteristics that distinguish it from all 
other consonants. The figure below displays all of the consonant 
sounds within their production groups. 


             Stops       Fricatives           Affricates
Voiced       B, D, KV    Z, V, J, THV         (D, J)
Voiceless    P, T, K     S, F, SCH, TH, HF    (T, SCH)


             Semi-vowels    Glides    Nasals
Voiced       R, L           W, Y      M, N, NG
Voiceless
Consonant Chart 


Voiced and voiceless consonants are subdivided into 6 
categories according to the manner in which they are 
produced in the human vocal tract: i.e., how the air flow 
is obstructed by the articulators to make each sound 
different. 


Consonant sounds are selected for a sequence in much the same 
manner as an alphabet character would be selected for the 
spelling of a word. Users must be alert, however, to identify the 
exceptions. Occasionally, a consonant appears in the spelling of a 
word but not in its sound sequence: the “b” in “comb” is not 
pronounced and the sound sequence reflects the absence of the 
“b”: /K OU M/. Some exceptions have grammatical rules that may 
be used in determining the appropriate sound. For example, a 
consonant may have 2 pronunciations according to its sound 
environment. The “s” used to pluralize the two words that follow
is pronounced differently based on whether the sound that
precedes it is voiced or unvoiced. The pronunciation of an “s” will
match the voicing characteristics of the sound it follows.


Examples: tips = /T I P S/
          tabs = /T AE B Z/
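
The voicing rule for the plural “s” can be sketched in Python as follows. The voiceless set is taken from the Consonant Chart; the function name `plural_s` is a hypothetical helper, and the sketch covers only the rule described here (it does not handle words that already end in a sound such as /S/ or /Z/).

```python
# Voiceless consonants from the Consonant Chart; every other sound,
# including all vowels, is treated as voiced.
VOICELESS = {"P", "T", "K", "S", "F", "SCH", "TH", "HF"}

def plural_s(symbols: list[str]) -> list[str]:
    """Append /S/ after a voiceless final sound, /Z/ after a voiced one."""
    if not symbols:
        raise ValueError("empty phoneme sequence")
    return symbols + ["S" if symbols[-1] in VOICELESS else "Z"]

print(plural_s(["T", "I", "P"]))   # ['T', 'I', 'P', 'S']
print(plural_s(["T", "AE", "B"]))  # ['T', 'AE', 'B', 'Z']
```
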


There are other types of consonantal exceptions. For example, 
the “t” in a word like “nation” is pronounced /SCH/ and the program
might look like this: nation = /N A AY SCH UH3 N/. Users must
listen to each word's pronunciation to determine the appropriate 
phoneme selection. 


There are 7 Consonant Allophones, each noted in the table below. 
The /L/ consonant is used in the initial position of a sequence for 
words beginning with “L,” while the /LF/ allophone will occupy a
medial or final position in a sequence: e.g., lull = /L UH LF/. The
/LB/ and the /L1/ allophones would be used when a more
constricted pronunciation of an “L” was required, as would occur
for some words of foreign languages.


Consonant    Consonant       Consonant    Vowel
Phoneme      Allophones      Phoneme      Allophone
L            L1, LB, LF      R            ER
R            R1, R2          Y            YI


Allophone Listing for /L/, /R/, & /Y/ 


The /R/ is an initial position phoneme. Both /R1/ and /R2/ have 
more constricted pronunciations than /R/ and may be used in 
sequence with soundless interrupts to create a trilled /R/. Often
when the /R/ is required in a medial or final position, it is vowelized 
and the /ER/ is used. Listening to the production of all four of 
these sounds will auditorily show that they may, occasionally, be 
used interchangeably. 


Examples: red = /R EH D/ 
bird = /B ER D/ 
motor = /M OU T ER/ 


The /Y/ consonant, used as the final sound in words ending with
“y,” has a vowel allophone that may be used as the initial sound of
words starting with “y.” Note that both /Y/ and /YI/ are auditorily
very close to the /E/ and the /IE/ vowels and may be considered 
interchangeable. 


Vowel Sounds 


There are 12 BASIC Vowel Phonemes. Vowels are subdivided 
according to the manner in which they are produced. All vowels 
are voiced sounds but each has a different output based on the 
degree of obstruction created by the opening of the mouth and the 
tongue position. Lip positions, another obstructing articulator, may 
range from spread flat to round. While the lips are in any of these 
positions, the jaw may be simultaneously dropped from a closed 
to an open position. 





Vowel Quadrilateral 


Vowels begin their production with the same voiced 
energy. Changes in the position of the tongue (front or 
back), the shape of the lips (from spread flat to 
rounded), and the position of the lower jaw (from closed 
to open) determine the final characteristics that allow 
listeners to distinguish between vowel sounds. 


Refer to the SSI 263A Phoneme Chart for the pronunciation 
reference on each BASIC vowel sound. Utilize the sound review 
techniques on the previous pages to practice identifying the vowel 
sounds in words and associating them with their phonetic 
symbols. 


The allophonic variations of vowels, 20 in number, are used in a 
phonetic program to enhance the pronunciation of a word. There 
are some cases where the allophone is required for articulate 
pronunciations. This is true for /AY/, /YI/ and /IU/, which are
integral components in the phonetic sequences for the “long a”
and the varied “long u.”


Examples: same = /S A AY M/ 
you = /YI IU U/ 


The table below places each allophone into the vowel 
quadrilateral to demonstrate approximately how they might relate 
to the BASIC vowels. Users are in no way restricted to traditional 
phonetic transcriptions that use only the BASIC vowel phonemes. 
Be encouraged to experiment with allophones. Place them in 
different positions in a sequence to auditorily check how they 
affect the overall pronunciation of a word.


[Table: allophones arranged under Front Vowels, Medial Vowels, and Back Vowels, with lip position ranging from Spread to Rounded]


Allophone Placement In Vowel Quadrilateral 


Vowel allophones are placed in the vowel quadrilateral 
according to their production features. The sounds they 
emit vary slightly from the BASIC vowels that occupy the 
same positions. 













Four vowel allophones (/:A/, /:OH/, /:U/, and /:UH/) are adapted
pronunciations of four of the BASIC vowels. These sounds are 
most commonly used for phonetically programming a foreign 
word. They may also be used as transitory sounds to link 
phonemes with opposite production features such as a round, 
open vowel with a very constricted, narrow consonant. 


There are five vowels that require two or more vowel sounds in 
sequence in order to achieve their pronunciations. These are 
generally referred to as diphthongs. Refer to the Diphthong 
Conversion Chart. 


The vowel quadrilateral is a handy tool to use for selecting vowel 
phonemes for diphthongs and other multi-phoneme units. For 
example, the diphthong in the word “I” starts with an /AH/ and
ends with an /E/. In order to move smoothly from the first sound to 
the second (transition), another vowel may be inserted between 
these two sounds in sequence. The most likely choice would be a 
vowel that falls somewhere between /AH/ and /E/ in the 
quadrilateral: e.g., /UH/, /EH/, /I/, etc. The sequence may look like
this: /AH EH E/ or /AH1 UH IE/ or /AH1 EH1 AY/. In their fullest
durations, a three-sound sequence would overarticulate the
diphthong. Shortening the first and last sounds by 1 duration and
the medial sound by 2 durations will produce a more acceptable
pronunciation (see SSI 263A Phoneme Parameters).
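
As a hedged illustration of the shortening just described, the sketch below builds Duration/Phoneme bytes for /AH EH E/. It assumes the byte layout that the register values in the “Hello” example imply (duration in the top two bits, phoneme code in the low six bits) and that duration 0 is the longest, so shortening by n steps means a duration value of n. The name `dp_byte` is hypothetical; the codes come from the Phoneme Chart.

```python
CODES = {"AH": 0x0E, "EH": 0x0A, "E": 0x01}  # from the Phoneme Chart

def dp_byte(symbol: str, shorten_by: int) -> int:
    """Duration 0 is longest, so shortening by n steps sets duration to n."""
    if not 0 <= shorten_by <= 3:
        raise ValueError("duration must be 0..3")
    return (shorten_by << 6) | CODES[symbol]

# First and last sounds shortened by 1 duration, medial sound by 2.
sequence = [("AH", 1), ("EH", 2), ("E", 1)]
print(" ".join(f"{dp_byte(s, n):02X}" for s, n in sequence))  # 4E 8A 41
```
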


SSI 263A Phoneme Parameters (Attributes)


To achieve an accurate pronunciation of a word produced by the 
SSI 263A synthesizer requires more than a selection of the 
appropriate phonemes. Like human speech sounds, synthesized 
sounds are further defined by the rate at which they are emitted 
(duration), the level of pitch at which they are emitted (inflection or 
frequency), and the intensity with which they are produced 
(amplitude). These are considered the three major speech 
parameters which give the overall production of a word its 
linguistic character, transforming simple speech into more 
complex language. Inflection, amplitude, and duration are only 
three of the parameters that users have control of during the 
programming process. The rate at which one sound moves into 
another (articulation) is also a controllable parameter. Other 
parameters are: the slope of the inflection (slope), the rate of each 
selected duration (rate), and the extended inflection frequencies 
(extension). Users may also select the base frequency at which 
speech may be produced (filter frequency). Refer to SSI 263A 
Phoneme Parameters, for the range of each and typical default 
values selected. 


Every phoneme selected for a sequence must be accompanied by 
assignments for each of the eight parameters. As users become 
more aware of their need to create different language effects with 
their synthesized speech output, they will require the flexibility and 
choice that comes with programmable parameters. For example, 
with 4 selectable durations per phoneme, the actual
pronunciation of each sound may be changed. Thus, every sound
has four possible outputs, increasing the users’ choice from 64
phonemes and allophones to 256. Each of the 256 may be
affected differently by each of the 32 possible inflection
assignments. Add to these possibilities 16 variations in amplitude 
and 16 variations in rate. The possible combinations are not 
limitless, of course, but they are very great and users are 
encouraged to experiment with as many as possible. 
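
The arithmetic behind these figures is a straightforward product. The short Python sketch below works it through using only the counts quoted in this paragraph; the function name `combination_counts` is illustrative.

```python
def combination_counts():
    """Multiply out the per-phoneme choices quoted in the text."""
    phonemes, durations = 64, 4
    inflections, amplitudes, rates = 32, 16, 16
    sounds = phonemes * durations                      # 64 x 4
    total = sounds * inflections * amplitudes * rates  # all four parameters
    return sounds, total

print(combination_counts())  # (256, 2097152)
```
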


Several of the parameters affect synthetic speech output as a
whole. These are articulation, pitch extension, and filter frequency. 
Users may select a single level at which to set the filter frequency, 
for example, and maintain that level throughout the programming 
process. 


Phonetic Programming Methodology 


Due to the great variety of phonemes and parameter choices, as 
well as the different effects the parameter selections have on the 
speech sounds, a systematic approach to selecting the variables 
is advised. The approach described below is only one of several 
that might be used. It may be adjusted to accommodate the user's 
special programming style or to accommodate later 
implementation of automatic control techniques. 


The first step is to transcribe the target word, phrase, etc., into its 
basic phonetic components. Next, enter these sounds into the SSI 
263A and auditorily check the output. Use the default values 
suggested in the Nominal Phoneme Parameter Table. The results 
should be a bit stilted if not misarticulated for the first trial 
program. Phoneme adjustment is next. Continue to make changes 
in the phoneme sequence, auditorily monitoring the changes, until 
an adequate pronunciation of the target is established. 


Begin parameter adjustments. First, maintain articulation, pitch 
extension and filter frequency at nominal values. The device 
should be kept in the transitioned inflection mode. Make 
adjustments in the levels of only one of the remaining 4 
parameters at a time, beginning with the duration and moving on 
to the inflection, rate, and amplitude (in that order) once the
specific effect of the current parameter has been achieved.
Return to a previously adjusted parameter at any time based on 
need. 


PHONEME CHART 


Hex Code*    Phoneme Symbol    Example Word (or Usage)

00           PA                (pause)
01           E                 MEET
02           E1                BENT
03           Y                 BEFORE
04           YI                YEAR
05           AY                PLEASE
06           IE                ANY
07           I                 SIX
08           A                 MADE
09           AI                CARE
0A           EH                NEST
0B           EH1               BELT
0C           AE                DAD
0D           AE1               AFTER
0E           AH                GOT
0F           AH1               FATHER
10           AW                OFFICE
11           O                 STORE
12           OU                BOAT
13           OO                LOOK
14           IU                YOU
15           IU1               COULD
16           U                 TUNE
17           U1                CARTOON
18           UH                WONDER
19           UH1               LOVE
1A           UH2               WHAT
1B           UH3               NUT
1C           ER                BIRD
1D           R                 ROOF
1E           R1                RUG
1F           R2                MUTTER (German)
20           L                 LIFT
21           L1                PLAY
22           LF                FALL (final)
23           W                 WATER
24           B                 BAG
25           D                 PAID
26           KV                TAG (glottal stop)
27           P                 PEN
28           T                 TART
29           K                 KIT
2A           HV                (hold vocal)
2B           HVC               (hold vocal closure)
2C           HF                HEART
2D           HFC               (hold fricative closure)
2E           HN                (hold nasal)
2F           Z                 ZERO
30           S                 SAME
31           J                 MEASURE
32           SCH               SHIP
33           V                 VERY
34           F
35           THV
36           TH
37           M
38           N
39           NG
3A           :A                MARCHEN (German)
3B           :OH               LOWE (French)
3C           :U                FUNF (German)
3D           :UH               MENU (French)
3E           E2                BITTE (German)
3F           LB                LUBE

*Note: Hex codes shown with DR0, DR1 = 0 (longest duration)
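
Because the hex codes run consecutively from 00 to 3F, the chart can be held as an ordered list and a transcription converted to codes by position. The Python sketch below does this for /F O N Z/; the names `SYMBOLS`, `CODE`, and `encode` are hypothetical, and the spellings AI, YI, IU1, and :UH are assumed readings of the chart.

```python
# Phoneme symbols in hex-code order (00 through 3F), as in the chart.
SYMBOLS = ("PA E E1 Y YI AY IE I A AI EH EH1 AE AE1 AH AH1 "
           "AW O OU OO IU IU1 U U1 UH UH1 UH2 UH3 ER R R1 R2 "
           "L L1 LF W B D KV P T K HV HVC HF HFC HN Z "
           "S J SCH V F THV TH M N NG :A :OH :U :UH E2 LB").split()
CODE = {symbol: index for index, symbol in enumerate(SYMBOLS)}

def encode(transcription: str) -> list[int]:
    """Convert a transcription such as '/F O N Z/' to phoneme codes
    at the longest duration (duration bits = 0)."""
    symbols = transcription.strip().strip("/").split()
    return [CODE[s] for s in symbols]

print(" ".join(f"{c:02X}" for c in encode("/F O N Z/")))  # 34 11 38 2F
```

An unknown symbol raises `KeyError`, which is a convenient way to catch a letter such as C, Q, or X that has no entry in the alphabet.
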


SSI 263A Diphthong Conversion Chart 


Phoneme Sequence           Example Words
A AY Y                     rain, became, stay
A IE EH1 UH3 LF            mail, hale, avail
AH1 AE1 EH1 Y              time, rhyme, sky
AH1 EH1 IE AW UH3 LF       smile, style, while
AH1 EH1 IE UH3 ER          fire, liar, inspire
UH3 AH1 Y                  mice, right, sniper
OU                         road, stone, lower
OU OO                      tore, four, floor
AH1 AW O U                 loud, flower, hour
UH3 AH1 O U                house, about, ouch
O UH1 AH I IE              boy, noise, annoy
O UH3 EH I OO LF           boil, spoil, doily
W U U U                    tune, spoon, do
YI IU U U                  you, few, music


SSI 263A Multi-Unit Conversion Chart 


Phoneme Sequence       Example Words
T HFC SCH              church, latch
KV HVC HF              good, lag, angry
D J                    just, ledge, wage
KV HF HFC              lake, corn, check
P HF                   pipe, pay, poor
K HF W                 quest, quick, aqua
T HF                   top, trip, strain
HFC K HF HVC S         six, exit, taxi
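
One way to apply the Multi-Unit Conversion Chart is as a substitution table. In the Python sketch below the lowercase keys ("ch", "g", "qu", and so on) are hypothetical labels chosen from the example words, not SSI 263A symbols; anything that is not a key passes through unchanged.

```python
# Multi-unit sequences from the chart, keyed by illustrative labels.
MULTI_UNIT = {
    "ch": "T HFC SCH", "g": "KV HVC HF", "j": "D J", "k": "KV HF HFC",
    "p": "P HF", "qu": "K HF W", "t": "T HF", "x": "HFC K HF HVC S",
}

def expand(units: list[str]) -> list[str]:
    """Replace each labelled unit with its phoneme sequence."""
    out = []
    for unit in units:
        out.extend(MULTI_UNIT[unit].split() if unit in MULTI_UNIT else [unit])
    return out

# "top" written as a t unit, the vowel /AH/, and a p unit.
print(expand(["t", "AH", "p"]))  # ['T', 'HF', 'AH', 'P', 'HF']
```
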


Nominal Phoneme Parameter Table 
(Suggested Default Values for Speech Development) 


Amplitude (A3 - A0)
    Range: 0 to F (softest to loudest, 0 = silent)
    Default: C
    Exceptions: KV = 0, B = D = 6

Duration (DR1, DR0)
    Range: 3 to 0 (shortest to longest)
    Default: 0

Filter Frequency Range (F7 - F0)
    Range: 00 to FF (lowest to highest)
    Default: E9

Inflection (Pitch) (I10 - I6, Transitioned Inflection Mode Only)
    Range: 0 to 1F (lowest to highest, 0 = silent)
    Default: 04

Extension and Range of Pitch (I11, I2 - I0)
    Range: 0 to 7 (low); 8 to F (high)
    Default: 8

Rate of Speech (R3 - R0)
    Range: 0 to F (slowest to fastest)
    Default: A

Slope of Inflection (I5 - I3, Transitioned Inflection Mode Only)
    Range: 0 to 7
    Default: 0

Articulation (Rate of) (A3 - A0)
    Range: 0 to 7 (slow to fast)
    Default: 5
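
The parameters above are combined into five register bytes. The Python sketch below packs them using the layout that the register values in the “Hello” example imply (for instance In = 0A with S = 0 gives IS = 50, and T = 5 with A = C gives TA = 5C). The function name `pack_registers`, its argument names, and the choice to leave the top bit of TA at 0 are assumptions; the default values are those used in the example rows.

```python
def pack_registers(pho, d=0, t=5, inflection=0x0A, slope=0,
                   a=0xC, r=0xA, e=0x8, ff=0xE9):
    """Return (DP, IS, RE, TA, FF) bytes for one phoneme entry."""
    dp = ((d & 0x3) << 6) | (pho & 0x3F)             # duration | phoneme
    is_ = ((inflection & 0x1F) << 3) | (slope & 0x7)  # inflection | slope
    re_ = ((r & 0xF) << 4) | (e & 0xF)               # rate | extension
    ta = ((t & 0x7) << 4) | (a & 0xF)                # articulation | amplitude
    return dp, is_, re_, ta, ff & 0xFF

# HF (code 2C) shortened by one duration step, other values as in the example.
print(" ".join(f"{b:02X}" for b in pack_registers(0x2C, d=1)))  # 6C 50 A8 5C E9
```
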


Example of Using Phonetic Programming Methodology: 


Developing “Hello” 


Phoneme Parameters:       Pho .D  T  In-S  A  R  E  FF
SSI 263 Register Data:    DP  IS  RE  TA  FF

KEY:

Pho = Phoneme
D   = Duration
T   = Articulation
In  = Inflection
S   = Slope of Inflection
A   = Amplitude
R   = Rate
E   = Extension and Range of Pitch
FF  = Filter Frequency

Register                                  Address
DP = Duration/Phoneme Register            000
IS = Inflection/Slope Register            001
RE = Rate/Extension Register              010
TA = Articulation/Amplitude Register      011
FF = Filter Frequency Register            1XX
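
The address column can be decoded as follows: the two low bits select among DP, IS, RE, and TA, while any address with the top bit set (1XX) selects the Filter Frequency register. This Python sketch assumes a three-bit register address as listed above; the name `register_name` is hypothetical.

```python
def register_name(address: int) -> str:
    """Map a 3-bit register address to the register abbreviation."""
    if not 0 <= address <= 0b111:
        raise ValueError("register address must be 0..7")
    if address & 0b100:          # 1XX: low two bits are don't-care
        return "FF"
    return ("DP", "IS", "RE", "TA")[address]

print([register_name(a) for a in range(8)])
# ['DP', 'IS', 'RE', 'TA', 'FF', 'FF', 'FF', 'FF']
```
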
1. Original Phoneme Entry: 
Pho  .D  T  In-S  A  R  E  FF    DP  IS  RE  TA
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
HF   .0  5  0A-0  C  A  8  E9    2C  50  A8  5C
EH   .0  5  0A-0  C  A  8  E9    0A  50  A8  5C
L    .0  5  0A-0  C  A  8  E9    20  50  A8  5C
O    .0  5  0A-0  C  A  8  E9    11  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
2. Phoneme Selection Refinement 
Pho  .D  T  In-S  A  R  E  FF    DP  IS  RE  TA
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
HF   .0  5  0A-0  C  A  8  E9    2C  50  A8  5C
EH   .0  5  0A-0  C  A  8  E9    0A  50  A8  5C
UH3  .0  5  0A-0  C  A  8  E9    1B  50  A8  5C
LF   .0  5  0A-0  C  A  8  E9    22  50  A8  5C
UH3  .0  5  0A-0  C  A  8  E9    1B  50  A8  5C
O    .0  5  0A-0  C  A  8  E9    11  50  A8  5C
OU   .0  5  0A-0  C  A  8  E9    12  50  A8  5C
U    .0  5  0A-0  C  A  8  E9    16  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
3. Duration Adjustment 
Pho  .D  T  In-S  A  R  E  FF    DP  IS  RE  TA
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
HF   .1  5  0A-0  C  A  8  E9    6C  50  A8  5C
EH   .0  5  0A-0  C  A  8  E9    0A  50  A8  5C
UH3  .2  5  0A-0  C  A  8  E9    9B  50  A8  5C
LF   .0  5  0A-0  C  A  8  E9    22  50  A8  5C
UH3  .2  5  0A-0  C  A  8  E9    9B  50  A8  5C
O    .2  5  0A-0  C  A  8  E9    91  50  A8  5C
OU   .0  5  0A-0  C  A  8  E9    12  50  A8  5C
U    .3  5  0A-0  C  A  8  E9    D6  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
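
The effect of a duration adjustment on the DP column can be checked by splitting each byte back into its two fields. The Python sketch below does this for several DP values from the table above; it assumes the layout the table implies (duration in the top two bits, phoneme code in the low six), and the name `decode_dp` is hypothetical.

```python
def decode_dp(byte: int) -> tuple[int, int]:
    """Split a Duration/Phoneme byte into (duration, phoneme code)."""
    return (byte >> 6) & 0x3, byte & 0x3F

for value in (0x6C, 0x9B, 0x91, 0xD6):
    duration, phoneme = decode_dp(value)
    print(f"{value:02X}: duration {duration}, phoneme {phoneme:02X}")
# 6C: duration 1, phoneme 2C
# 9B: duration 2, phoneme 1B
# 91: duration 2, phoneme 11
# D6: duration 3, phoneme 16
```
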
4. Phoneme and Duration Adjustment 
Pho  .D  T  In-S  A  R  E  FF    DP  IS  RE  TA
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
HF   .1  5  0A-0  C  A  8  E9    6C  50  A8  5C
EH1  .1  5  0A-0  C  A  8  E9    4B  50  A8  5C
UH3  .2  5  0A-0  C  A  8  E9    9B  50  A8  5C
LF   .0  5  0A-0  C  A  8  E9    22  50  A8  5C
UH3  .2  5  0A-0  C  A  8  E9    9B  50  A8  5C
O    .2  5  0A-0  C  A  8  E9    91  50  A8  5C
OU   .0  5  0A-0  C  A  8  E9    12  50  A8  5C
U    .2  5  0A-0  C  A  8  E9    96  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C
PA   .0  5  0A-0  C  A  8  E9    00  50  A8  5C

5. Inflection Adjustment 


Pho   In-S
PA    0B-0
PA    0B-0
HF    0A-0
EH1   08-0
UH3   09-0
LF    08-0
UH3   05-0
O     05-0
OU    06-0
U     07-0
PA    0A-0
PA    0B-0


6. Phoneme, Duration, Inflection, and Rate Adjustment

Pho  .D  In-S  FF    DP  IS  RE  TA
PA   .0  0B-0  E9    00  58  A8  5C
PA   .0  0B-0  E9    00  58  A8  5C
HF   .1  0A-0  E9    6C  50  78  5C
EH1  .1  08-0  E9    4B  40  D8  5C
UH3  .2  09-0  E9    9B  48  C8  5C
LF   .0  08-0  E9    22  40  C8  5C
UH3  .2  05-0  E9    9B  28  98  5C
O    .2  05-0  E9    91  28
OU   .0  06-0  E9    12  30
U    .2  07-0  E9    96  38
U    .3  0A-0  E9    D6  50
PA   .0  0B-0  E9    00  58
PA   .0  0A-0  E9    00  50

7. Phoneme, Duration, Inflection, Rate, and Amplitude Adjustment

Pho  .D  T  In-S    DP  IS  RE
PA   .0  5  0B-0    00  58  A8
PA   .0  5  0B-0    00  58  A8
EH   .0  5  07-0    0A  38  D8
HF   .1  5  0A-0    6C  50  78
EH1  .1  5  08-0    4B  40  D8
UH3  .2  5  09-0    9B  48  C8
LF   .0  5  08-0    22  40  C8
UH3  .2  5  05-0    9B  28  98
O    .2  5  05-0    91  28  98
OU   .0  5  06-0    12  30  A8
U    .2  5  07-0    96  38  C8
U    .3  5  0A-0    D6  50  78
PA   .0  5  0B-0    00  58  A8
PA   .0  5  0A-0    00  50  A8


8. Further Adjustment (depending on personal preference) 


In-S    IS
0D-0    68
0D-0    68
0D-0    68
07-0    38
09-2    4A
09-4    4C
09-0    48
07-7    3F
06-4    34
05-2    2A
06-3    33
07-4    3C
05-4    2C
01-4    0C
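
The In-S and IS values in this step are two views of the same byte. The Python sketch below splits an IS byte back into its inflection and slope fields, assuming the layout the worked values imply (inflection in the upper five bits, slope in the lower three); the name `decode_is` is hypothetical.

```python
def decode_is(byte: int) -> tuple[int, int]:
    """Split an Inflection/Slope byte into (inflection, slope)."""
    return (byte >> 3) & 0x1F, byte & 0x7

for value in (0x68, 0x4A, 0x3F, 0x0C):
    inflection, slope = decode_is(value)
    print(f"{value:02X} = {inflection:02X}-{slope}")
# 68 = 0D-0
# 4A = 09-2
# 3F = 07-7
# 0C = 01-4
```
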


